The asymptotic number of rooted trees in a tree

Abstract

Let $\mathcal{T}_n$ be the set of trees with $n$ vertices. Suppose that each tree in $\mathcal{T}_n$ is equally likely. We show that the number of rooted trees corresponding to any given tree equals $(\mu_r + o(1))n$ in almost every tree of $\mathcal{T}_n$, where $\mu_r$ is a constant. As an application, we show that the number of occurrences of any given pattern in $\mathcal{T}_n$ is also asymptotically normally distributed with mean $\sim \mu_{\mathcal{M}} n$ and variance $\sim \sigma_{\mathcal{M}} n$, where $\mu_{\mathcal{M}}$ and $\sigma_{\mathcal{M}}$ are some constants related to the given pattern.

Keywords: tree; rooted tree; pattern; generating function; limiting distribution; automorphism

1 Introduction

A pattern is a given small tree. We say that a pattern $\mathcal{M}$ occurs in a tree $T$ if $\mathcal{M}$ is a subtree of $T$ in the sense that the degree of each internal vertex (of degree more than one) of $\mathcal{M}$ matches the degree of the corresponding vertex in $T$, while each external vertex (of degree one) of $\mathcal{M}$ matches a vertex of $T$ with an arbitrary degree. Occasionally, for brevity, we say that the pattern is in a tree instead of saying that the pattern occurs in a tree.
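As a concrete illustration of this definition, consider the simplest kind of pattern, a star whose centre has degree $d \ge 2$. Its only internal vertex is the centre, so it occurs once at every vertex of $T$ whose degree is exactly $d$. The following Python sketch counts these occurrences. The adjacency-dictionary representation and the name `count_star_occurrences` are assumptions made for this illustration only.

```python
def count_star_occurrences(adj, d):
    """Count occurrences of the star pattern whose centre has degree d.

    adj maps each vertex of a tree to the list of its neighbours
    (a hypothetical representation chosen for this sketch).
    The centre is the only internal vertex of the star, so the pattern
    occurs exactly at the vertices of degree d.
    """
    if d < 2:
        raise ValueError("the centre of the star must have degree at least 2")
    return sum(1 for neighbours in adj.values() if len(neighbours) == d)


# The path 0-1-2-3 with an extra leaf 4 attached to vertex 1.
tree = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
print(count_star_occurrences(tree, 3))
print(count_star_occurrences(tree, 2))
```

General patterns need a genuine subtree-matching routine; the star is used here only because its occurrences can be read off from the degree sequence.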

Let $\mathcal{T}_n$ be the set of trees with $n$ vertices. If we use $X_{n,\mathcal{M}}$ to denote the number of occurrences of a given pattern $\mathcal{M}$ in a tree of $\mathcal{T}_n$, then $X_{n,\mathcal{M}}$ is a random variable with probability

$$\Pr(X_{n,\mathcal{M}} = k) = \frac{t_{n,k}}{t_n},$$

where $t_{n,k}$ denotes the number of trees in $\mathcal{T}_n$ in which the pattern $\mathcal{M}$ occurs exactly $k$ times, and $t_n = |\mathcal{T}_n|$.

Moreover, let $\mathcal{R}_n$ be the set of rooted trees with $n$ vertices. We can also consider the number of occurrences of a given pattern in $\mathcal{R}_n$. We still use $X_{n,\mathcal{M}}$ to denote the corresponding random variable on $\mathcal{R}_n$, which causes no ambiguity.

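The main work of this paper is to show that some random variable $Y_n$ satisfies

$$\frac{Y_n - E(Y_n)}{\sqrt{\mathrm{Var}(Y_n)}} \xrightarrow{\ w\ } N(0,1),$$

where $N(0,1)$ is the random variable with standard normal distribution and $\xrightarrow{\ w\ }$ means weak convergence. We then call $Y_n$ asymptotically normal. Moreover, if

$$E(Y_n) \sim \mu_n \quad \text{and} \quad \mathrm{Var}(Y_n) \sim \sigma_n,$$

then we say that $Y_n$ is asymptotically normal with mean $\sim \mu_n$ and variance $\sim \sigma_n$. We refer the readers to [7] for details.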

In fact, it was shown in [2] that in $\mathcal{R}_n$ the number $X_{n,\mathcal{M}}$ of occurrences of any given pattern is asymptotically normal with mean $\sim \mu_{\mathcal{M}} n$ and variance $\sim \sigma_{\mathcal{M}} n$, where $\mu_{\mathcal{M}}$ and $\sigma_{\mathcal{M}}$ are some constants corresponding to the given pattern. For the set $\mathcal{T}_n$, however, no such result on normality was available. In [6], the authors proved that for any given pattern in $\mathcal{T}_n$, the limiting distribution has a density $(A + Bt^2)e^{-Ct^2}$, where $A$, $B$, $C$ are some constants. The mean and variance of the number of occurrences of any given pattern are still asymptotically $\mu_{\mathcal{M}} n$ and $\sigma_{\mathcal{M}} n$, where the constants are the same as in $\mathcal{R}_n$. Clearly, if one shows that $B = 0$, then the distribution is normal. For some special patterns, such as a star (or a node with a given degree) pattern, a double-star pattern [8], and a path pattern [7], the corresponding limiting distributions were proved to be normal. For some previous work we refer to Robinson and Schwenk [11]. For more details, we refer to [2, 6, 7, 11]. For a general pattern, however, Kok claimed in his thesis [7] that normality seems much more difficult to demonstrate. In this paper, we solve this problem from a new point of view, different from the existing ones. We study the number of rooted trees in a tree and obtain that for almost every tree the number of corresponding rooted trees is $(\mu_r + o(1))n$. Then, for any given pattern $\mathcal{M}$, since the number of occurrences of $\mathcal{M}$ in $\mathcal{R}_n$ is already known to be asymptotically normal, it follows that the limiting distribution for $\mathcal{M}$ in $\mathcal{T}_n$ is also normal.

We organize this paper as follows. In Section 2, we introduce some basic knowledge that will be used in our proofs. In Section 3, we present the detailed proofs, concentrating on the number of rooted trees in a tree. Section 4 is devoted to studying the limiting distribution for any given pattern.

2 Preliminaries



Analogous to patterns, for each tree we use $X_n$ to denote the number of corresponding rooted trees. Clearly, $X_n$ is also a random variable on $\mathcal{T}_n$. We introduce the following two functions:

$$t(x) = \sum_{n \ge 1} t_n x^n,$$

$$t(x,u) = \sum_{n \ge 1,\, k \ge 0} t_{n,k}\, x^n u^k,$$

where the coefficient $t_{n,k}$ denotes the number of trees in $\mathcal{T}_n$ that have $k$ rooted trees. Clearly, $\sum_{k \ge 0} t_{n,k} = t_n$. We always assume that every tree of $\mathcal{T}_n$ is equally likely. Then, $\Pr(X_n = k) = t_{n,k}/t_n$.

Let $T_n$ be a tree in $\mathcal{T}_n$. An automorphism $\Phi$ of $T_n$ is a bijection on the vertex set of $T_n$ such that

$$v_i v_j \in E(T_n) \iff \Phi(v_i)\Phi(v_j) \in E(T_n),$$

where $v_i$ and $v_j$ are any two vertices of $T_n$ and $v_iv_j$ denotes the edge of $T_n$ joining $v_i$ and $v_j$. We say that two vertices $u$ and $v$ of $T_n$ are in the same vertex class if $u$ can be mapped to $v$ by some automorphism. Clearly, this defines an equivalence relation on the vertex set of $T_n$, and hence the vertices of $T_n$ are partitioned into classes. If we designate any vertex of a given vertex class to be the root, we get the same rooted tree. To be precise, suppose that $v_i$ is mapped to $v_j$ under an automorphism $\Phi$ of $T_n$. Then, by definition, $\Phi$ is also an isomorphism that maps the tree rooted at $v_i$ to the tree rooted at $v_j$. Conversely, an isomorphism from the tree rooted at $v_i$ to the tree rooted at $v_j$ is an automorphism of $T_n$ that maps $v_i$ to $v_j$.

Hence, the number of rooted trees in a tree is exactly the number of vertex classes in the tree. From now on, we let $X_n$ also denote the number of vertex classes of a tree under automorphisms. In this form, $X_n$ can also be defined on the rooted tree space $\mathcal{R}_n$.
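The identification of rooted trees with vertex classes can be checked on small examples. The sketch below roots a tree at every vertex, encodes each rooted tree by a canonical nested tuple (a standard sorted-children encoding, not a construction taken from this paper), and counts the distinct encodings. The adjacency-dictionary input and the function name are assumptions made for illustration.

```python
def count_vertex_classes(adj):
    """Number of vertex classes of a tree under its automorphisms.

    adj maps each vertex to the list of its neighbours. Two vertices lie
    in the same class exactly when the trees rooted at them are isomorphic,
    so it suffices to count distinct canonical forms of the rooted trees.
    """
    def canon(v, parent):
        # Canonical form of the subtree hanging below v: the sorted tuple
        # of the canonical forms of its children.
        return tuple(sorted(canon(w, v) for w in adj[v] if w != parent))

    return len({canon(v, None) for v in adj})


path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star3 = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
# A spider with legs of lengths 1, 2 and 3 has no non-trivial automorphism.
spider = {0: [1, 2, 4], 1: [0], 2: [0, 3], 3: [2],
          4: [0, 5], 5: [4, 6], 6: [5]}

for name, tree in [("path4", path4), ("star3", star3), ("spider", spider)]:
    print(name, count_vertex_classes(tree))
```

The recursion recomputes subtrees for every choice of root, which is quadratic in the number of vertices; that is adequate for small examples, and the recursion depth limits it to trees of moderate height.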

If we consider $X_n$ on $\mathcal{R}_n$, we also suppose that each tree in $\mathcal{R}_n$ is equally likely. We can define similar generating functions on $\mathcal{R}_n$, and let $r(x)$, $r(x,u)$ be the corresponding functions. One can see that $r(x,1) = r(x)$. Suppose

$$r(x,u) = \sum_{n \ge 1,\, k \ge 0} r_{n,k}\, x^n u^k,$$

where $r_{n,k}$ is the number of rooted trees in $\mathcal{R}_n$ that have $k$ vertex classes. It follows that in $\mathcal{R}_n$,

$$\Pr(X_n = k) = \frac{r_{n,k}}{r_n},$$

where $r_n = \sum_{k \ge 0} r_{n,k}$.

We should notice that when we count the number of vertex classes in a rooted tree, the root itself always forms a class with a single vertex, since any automorphism of a rooted tree must map the root to itself. This is slightly different from the case of unrooted trees.

Furthermore, suppose that the radius of convergence of $r(x)$ is $x_0$. Otter [9] showed that $x_0$ satisfies $r(x_0) = 1$ and that the asymptotic expansion of $r(x)$ is

$$r(x) = 1 - b_1 (x_0 - x)^{1/2} + b_2 (x_0 - x) + b_3 (x_0 - x)^{3/2} + \cdots, \qquad (1)$$

where $x_0 \approx 0.3383219$ and $b_1 \approx 2.6811266$. Moreover, $t(x)$ has a similar expansion, namely,

$$t(x) = c_0 + c_1 (x_0 - x) + c_2 (x_0 - x)^{3/2} + \cdots. \qquad (2)$$

Applying the approximation theory [10] to Eqs. (1) and (2), we get that

$$t_n \sim \frac{C\, x_0^{-n}}{n^{5/2}}, \qquad r_n \sim \frac{D\, x_0^{-n}}{n^{3/2}},$$

where $C$ and $D$ are some constants. We refer the readers to [11]. It has been shown that $C = 0.5349\ldots$ and $D = 0.4399\ldots$. The approximation theory used in [11] is due to Pólya; the original paper [10] gives more details.
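These asymptotic formulas can be compared with exact counts. The sketch below is not taken from this paper: it uses the classical recurrence $n\,r_{n+1} = \sum_{k=1}^{n} \big(\sum_{d \mid k} d\, r_d\big) r_{n-k+1}$ for rooted trees and Otter's identity $t(x) = r(x) - \tfrac12\big(r(x)^2 - r(x^2)\big)$ for unrooted trees, both standard facts assumed here. The function names are hypothetical, and the constants are the truncated values quoted above, so only rough agreement should be expected.

```python
def rooted_tree_counts(n_max):
    """Return a list r with r[n] = number of rooted trees on n vertices."""
    r = [0] * (n_max + 1)
    r[1] = 1
    s = [0] * (n_max + 1)  # s[k] = sum of d * r[d] over the divisors d of k
    for n in range(1, n_max):
        s[n] = sum(d * r[d] for d in range(1, n + 1) if n % d == 0)
        r[n + 1] = sum(s[k] * r[n - k + 1] for k in range(1, n + 1)) // n
    return r


def unrooted_tree_counts(r):
    """Return t with t[n] = number of trees on n vertices (Otter's identity)."""
    n_max = len(r) - 1
    t = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        pairs = sum(r[i] * r[n - i] for i in range(1, n))  # [x^n] r(x)^2
        same = r[n // 2] if n % 2 == 0 else 0              # [x^n] r(x^2)
        t[n] = r[n] - (pairs - same) // 2
    return t


X0, C, D = 0.3383219, 0.5349, 0.4399  # truncated constants from the text
n = 40
r = rooted_tree_counts(n)
t = unrooted_tree_counts(r)
# Both ratios should approach 1 as n grows.
print(r[n] / (D * X0 ** (-n) * n ** -1.5))
print(t[n] / (C * X0 ** (-n) * n ** -2.5))
```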

In what follows, we first investigate $X_n$ in $\mathcal{R}_n$. To start with, we need the following two lemmas. We refer the readers to [3, 7] for detailed information.

Lemma 1. Suppose that $F(x,y,u)$ is an analytic function around $(x_0, y_0, 1)$ such that $F(x_0,y_0,1) = y_0$, $F_y(x_0,y_0,1) = 1$, $F_{yy}(x_0,y_0,1) \neq 0$ and $F_x(x_0,y_0,1) \neq 0$. Then there exist a neighborhood $U_0$ of $(x_0,1)$, a neighborhood $U_1$ of $y_0$ and analytic functions $g(x,u)$, $h(x,u)$ and $f(u)$ which are defined on $U_0$ such that the only solutions $y \in U_1$ with $y = F(x,y,u)$ and $(x,u) \in U_0$ are given by

$$y(x,u) = g(x,u) + h(x,u)\sqrt{1 - \frac{x}{f(u)}}.$$

Furthermore, $g(x_0,1) = y_0$ and $h(x_0,1) = \sqrt{\dfrac{2x_0 F_x(x_0,y_0,1)}{F_{yy}(x_0,y_0,1)}}$. □

Lemma 2. Let $y(x,u)$ denote the function defined on a neighborhood $U$ of $(x_0,1)$, with $y(x,u) = F(x,y(x,u),u) = g(x,u) + h(x,u)\sqrt{1 - x/f(u)}$. If $y(x,1)$ is aperiodic, i.e., if $y(x,1) = x^{r}\,\tilde{y}(x^{d})$ with some power series $\tilde{y}$ implies $d = 1$, and all the Taylor coefficients of $F_y(x,y,u)$ are non-negative, then there exists an $\eta > 0$ such that $y(x,u)$ can be analytically continued in

$$U = \{(x,u) : |x| < x_0 + \eta,\ |u| < 1 + \eta,\ \arg(x - f(u)) \neq 0,\ x \neq f(u)\}.$$

Moreover, if $y(x,u) = \sum_{n,k} y_{n,k}\, x^n u^k$ with $y_{n,k} \ge 0$, if $X_n$ is the random variable with $\Pr(X_n = k) = y_{n,k} / \sum_{j} y_{n,j}$, and if $h(f(1),1) \neq 0$, then $X_n$ is asymptotically normal with mean $\sim \mu n$ and variance $\sim \sigma n$. □



Remark: In [3] and [7], the authors always assumed that all the Taylor coefficients of $F(x,y,u)$ are non-negative. However, the proofs in [3] and [7] still go through under the following weaker condition: all the Taylor coefficients of $F_y(x,y,u)$ are non-negative.

3 The number of rooted trees in a tree 

Now we concentrate on the number of vertex classes of a rooted tree. Recall that an automorphism of a rooted tree must map the root to itself, which is slightly different from an automorphism of a tree: the root always forms a vertex class with a single vertex. We shall show that $X_n$ is asymptotically normal with mean $(\mu_r + o(1))n$ and variance $(\sigma_r + o(1))n$ in $\mathcal{R}_n$.

In what follows, there appears an expression of the form $Z^*(S_n; f(x,u))$ (or $Z(S_n; f(x))$), which is the substitution of the counting series $f(x,u)$ (or $f(x)$) into the cycle index $Z(S_n)$ of the symmetric group $S_n$. This involves replacing each variable $s_i$ in $Z(S_n)$ by $f(x^i,u)$ (or $f(x^i)$). For instance, if $n = 3$, then $Z(S_3) = (1/3!)(s_1^3 + 3s_1s_2 + 2s_3)$ and

$$Z(S_3; f(x)) = \frac{1}{3!}\big(f(x)^3 + 3f(x)f(x^2) + 2f(x^3)\big),$$

$$Z^*(S_3; f(x,u)) = \frac{1}{3!}\big(f(x,u)^3 + 3f(x,u)f(x^2,u) + 2f(x^3,u)\big).$$

We refer the readers to [5] for details. In [5], it was shown that

$$r(x) = x \sum_{n \ge 0} Z(S_n; r(x)).$$

The coefficient of $x^p$ in $Z(S_n; r(x))$ is the number of rooted trees of order $p + 1$ whose roots have degree $n$. Multiplication of $Z(S_n; r(x))$ by $x$ corrects the power of $x$, so that the coefficient of $x^p$ in $xZ(S_n; r(x))$ is the number of those trees with $p$ vertices. The expression $Z(S_n; r(x))$ follows from the Pólya Enumeration Theorem [5].
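The substitution $Z(S_n; r(x))$ can be carried out on truncated power series. The sketch below uses the standard recurrence $n\,Z(S_n) = \sum_{i=1}^{n} s_i\, Z(S_{n-i})$ with $Z(S_0) = 1$ (a known identity, assumed here rather than taken from this paper) and the first few values of $r_n$, which are entered by hand. It then checks the identity $r(x) = x\sum_{n \ge 0} Z(S_n; r(x))$ up to the truncation order. All function names are hypothetical.

```python
from fractions import Fraction

N = 8                                  # truncation order of all series
r = [0, 1, 1, 2, 4, 9, 20, 48, 115]    # r[n] = rooted trees on n vertices


def multiply(a, b):
    """Product of two power series truncated at degree N."""
    c = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j > N:
                    break
                c[i + j] += ai * bj
    return c


def dilate(f, i):
    """The series f(x^i), truncated at degree N."""
    g = [Fraction(0)] * (N + 1)
    for n, fn in enumerate(f):
        if n * i <= N:
            g[n * i] = Fraction(fn)
    return g


def cycle_index_substitutions(n_max, f):
    """Return [Z(S_0; f), ..., Z(S_n_max; f)] as truncated series."""
    Z = [[Fraction(1)] + [Fraction(0)] * N]
    for m in range(1, n_max + 1):
        acc = [Fraction(0)] * (N + 1)
        for i in range(1, m + 1):
            term = multiply(dilate(f, i), Z[m - i])
            acc = [a + t for a, t in zip(acc, term)]
        Z.append([a / m for a in acc])
    return Z


Z = cycle_index_substitutions(N, r)
total = [sum(Z[n][p] for n in range(N + 1)) for p in range(N + 1)]
# The coefficient of x^p in x * sum_n Z(S_n; r(x)) is total[p - 1].
print(all(total[p - 1] == r[p] for p in range(1, N + 1)))
print(Z[2][:5])  # coefficients of Z(S_2; r(x)): roots of degree 2
```

Exact rational arithmetic is used because the cycle index has denominators $n!$, although every coefficient of the final series is an integer.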

Analogously, we apply the same procedure to $r(x,u)$ in this paper. Here, however, we should notice that if two identical copies of a rooted tree with $k$ vertex classes are attached to a root, then the number of vertex classes of the new rooted tree is $k + 1$, because there is only one new class, namely the new root. This differs from the procedure for counting the occurrences of a given pattern. Hence, we use $r(x^k,u)$ to denote the generating function for $k$ copies of a rooted tree, and we get

$$r(x,u) = xu \cdot \exp\Big(\sum_{k \ge 1} \frac{1}{k}\, r(x^k,u)\Big) + \psi(x,u), \qquad (3)$$

where the modification term $\psi(x,u)$ is a polynomial.

For instance, suppose that the tree has a root of degree 2. Then $xu \cdot Z^*(S_2; r(x,u)) = xu \cdot \frac{1}{2}\big(r(x,u)^2 + r(x^2,u)\big)$. We have $r_{n,k}$ choices to form a rooted tree with two identical branches. In $r(x,u)^2$ the term $r_{n,k}\, x^{2n} u^{2k}$ denotes the number of such rooted trees. We should notice that the power $2k$ must be corrected into $k$, which means that the number of vertex classes is still $k$. But in $r(x^2,u)$ the term $r_{n,k}\, x^{2n} u^{k}$ denotes the number of those rooted trees with two identical branches. Moreover, note that $r(x^2,u)$ and $r(x^2,u^2)$ are polynomials, since $r(x,u)$ is a polynomial. Hence, there must be a polynomial modification term $\psi_2(x,u)$ such that $xu \cdot Z^*(S_2; r(x,u)) + \psi_2(x,u)$ is the generating function of the trees with roots of degree 2. We can see that for any number of branches, we have to modify the function when counting the cases in which some branches are identical. Hence, in general, expression (3) follows.

Let $y = r(x,u)$ and $F(x,y,u) = xu \cdot \exp\big(y + \sum_{k \ge 2} \frac{1}{k}\, r(x^k,u)\big) + \psi(x,u)$. Here, the Taylor coefficients of $F$ in $x$, $y$ and $u$ may not be non-negative. Recall that $r(x,1) = r(x)$, that is, $\psi(x,1) = 0$. Recall also that there exists a real number $x_0$ such that $r(x_0) = 1$ and $x_0$ is the radius of convergence.

Then, we have $F(x_0, y(x_0,1), 1) = 1 = y(x_0,1) = r(x_0,1)$ and

$$F_y(x,y,u) = xu \cdot \exp\Big(\sum_{k \ge 1} \frac{1}{k}\, r(x^k,u)\Big),$$

which implies that $F_y(x_0,y_0,1) = 1$. Moreover, $F_{yy}(x_0,y_0,1) \neq 0$ and $F_x(x_0,y_0,1) \neq 0$. That is, all the conditions of Lemma 1 hold. Furthermore, the Taylor coefficients of $F_y(x,y,u)$ are non-negative, because in $F_y$ all the $r(x^k,u)$ with $k \ge 2$ are polynomials and the expression of $F_y$ has an exponential form. Thus, by Lemma 2, the random variable $X_n$ is asymptotically normal with mean

$$E(X_n) \sim \mu_r n \quad (n \to \infty)$$

and variance

$$\mathrm{Var}(X_n) \sim \sigma_r n \quad (n \to \infty),$$

where $\mu_r$ and $\sigma_r$ are some constants.

In this paper, we mainly focus on the overall property of a probability space. Following the book [1], we say that almost every (a.e.) graph in a graph space $\mathcal{G}_n$ has a certain property $Q$ if the probability $\Pr(Q)$ in $\mathcal{G}_n$ converges to 1 as $n$ tends to infinity. Occasionally, we say almost all instead of almost every.

From the Chebyshev inequality

$$\Pr\big[\,|X_n - E(X_n)| > n^{3/4}\,\big] \le \frac{\mathrm{Var}(X_n)}{n^{3/2}} \to 0 \quad \text{as } n \to \infty,$$

it follows that for almost all rooted trees, $E(X_n) - n^{3/4} < X_n < E(X_n) + n^{3/4}$, namely, $X_n = (1 + o(1))E(X_n)$. We thus obtain the following result.

Theorem 3. For almost all rooted trees in $\mathcal{R}_n$, the number of vertex classes under automorphisms is $(\mu_r + o(1))n$. □
Therefore, we can study the number of vertex classes in a tree. To get the final result, we need another property as follows. We have defined the number of vertex classes of a tree. We call a vertex fixed if this single vertex forms a class.

Lemma 4. Almost every tree in $\mathcal{T}_n$ has more than $\frac{1}{24}n$ fixed vertices.

Proof. We prove this result by contradiction. Suppose that $\mathcal{T}_n'$ is a subset of $\mathcal{T}_n$ such that every tree in $\mathcal{T}_n'$ has at most $\frac{1}{24}n$ fixed vertices. Let $T_n$ be a tree in $\mathcal{T}_n'$. We first show that the fixed vertices form a subtree of $T_n$. In fact, any two fixed vertices $v_1$ and $v_2$ can only be mapped to $v_1$ and $v_2$ themselves, respectively. Thus, the $(v_1,v_2)$-path is mapped to itself under any automorphism. So, all vertices on the $(v_1,v_2)$-path are fixed, that is, all the fixed vertices form a connected subgraph of $T_n$. Consequently, the fixed vertices induce a subtree $T''$ of $T_n$ and $|T''| \le \frac{1}{24}n$. If $|T''| = 0$, then the tree has a symmetrical edge, and the structure of $T_n$ is determined by one half of its vertices. Hence, the number of trees in $\mathcal{T}_n$ having a symmetrical edge is at most $|\mathcal{R}_{n/2}|$, and so

$$\frac{|\mathcal{R}_{n/2}|}{|\mathcal{T}_n|} \to 0.$$

Then, we always suppose $|T''| > 0$. Let $u$ be a vertex in $T''$. Suppose that $H_u$ is a subtree of $T_n$ attached to $u$ such that none of the vertices of $H_u$ is in $T''$. Suppose there are $m$ copies of $H_u$ after deleting $u$. We have $m \ge 2$, otherwise the vertex of $H_u$ adjacent to $u$ is also a fixed vertex, which is a contradiction. If $m$ is even, we get rid of $m/2$ copies of $H_u$. If $m$ is odd, we delete $(m+1)/2$ copies of $H_u$. We repeat this operation on all vertices in $T''$. At the end, this produces a new tree $A$ with at most $\frac{1}{2}\big(n + \frac{1}{24}n\big)$ vertices, and we denote the set of these new trees by $\mathcal{A}_{\frac{25}{48}n}$. Moreover, we replace the remaining $\lfloor m/2 \rfloor$ copies of $H_u$ by a single vertex, that is, we attach some new vertices to $u$, where different kinds of $H_u$ correspond to different vertices. In this way we construct another tree $A'$. Observe that $T''$ is a subtree of $A'$. We shall show that $A'$ has at most $n/3$ vertices. Color the vertices in $A'$ corresponding to $T''$ red and the others green. We already know that there are at most $\frac{1}{24}n$ red vertices. Let $u$ be a red vertex. If a green vertex adjacent to $u$ in $A'$ represents only one vertex of $A$, then there is only one such vertex adjacent to $u$. One can see an example in Figure 1. Hence, in $A'$ there are at most $\frac{1}{24}n$ vertices representing subtrees with a single vertex and at most $\frac{1}{2} \cdot \frac{1}{2}n$ vertices representing the other subtrees (or forests) with at least two vertices. Consequently, $A'$ has at most $n/3$ vertices. Moreover, we need the fact that the order of $\mathcal{T}_n$ is asymptotically $C x_0^{-n}/n^{5/2}$. So, the number of trees with at most $\frac{25}{48}n$ vertices is asymptotically less than $2 \cdot C x_0^{-\frac{25}{48}n}$.

In the above, we have built a map from $\mathcal{T}_n'$ to $\mathcal{A}_{\frac{25}{48}n}$. Suppose that $A$ is a tree in $\mathcal{A}_{\frac{25}{48}n}$. Then $|A|$ is at most $\frac{25}{48}n$. The number of ways to choose $k$ vertices to form a subtree of $A$ is at most $\binom{\frac{25}{48}n}{k}$. We color these vertices in $A$ red. Notice that any subtree $T''$ has at most $\frac{1}{24}n$ vertices. Then, the number of subtrees in $A$ having at most $\frac{1}{24}n$ vertices is at most $\sum_{k=1}^{n/24} \binom{\frac{25}{48}n}{k}$.

Figure 1: An example of $A'$ and $A$.

Let $T''$ be one of the subtrees with $k$ vertices. Once the red vertices have been selected, we suppose that $A'$ is the corresponding tree defined as above. For $u \in T''$, each green vertex in $A'$ adjacent to $u$ corresponds to a kind of subtree $H_u$, and the number of copies of $H_u$ can be odd or even. That is, for each $H_u$ we have at most two choices to form a tree with $n$ vertices. Since $A'$ has at most $n/3$ vertices, there are fewer than $n/3$ green vertices, that is, vertices in $A'$ but not in $T''$. Hence, there exist at most $2^{n/3}$ trees in $\mathcal{T}_n'$ mapping to $A$.

Therefore, for subtrees $T''$ with $k$ vertices, at most $2^{n/3} \cdot 2C \cdot x_0^{-\frac{25}{48}n} \cdot \binom{\frac{25}{48}n}{k}$ trees in $\mathcal{T}_n'$ map to them. Recall that each $T_n \in \mathcal{T}_n'$ corresponds to some $T''$. Then we have



$$|\mathcal{T}_n'| \le \sum_{k=1}^{n/24} 2^{n/3} \cdot 2C \cdot x_0^{-\frac{25}{48}n} \binom{\frac{25}{48}n}{k} \le \frac{n}{24} \cdot 2^{n/3} \cdot 2C \cdot x_0^{-\frac{25}{48}n} \binom{\frac{25}{48}n}{\frac{1}{24}n}.$$

By Stirling's approximation, i.e., $\dfrac{n!}{\sqrt{2\pi n}\,(n/e)^n} \to 1$ as $n \to \infty$, we get that when $n$ is large enough,

$$|\mathcal{T}_n'| < c_1 \cdot \beta^{n},$$

where $c_1$ is some real number and $\beta < 1/x_0$ is a constant. Recall that $|\mathcal{T}_n| \sim C x_0^{-n} n^{-5/2}$ with $x_0 \approx 0.3383219$. Consequently, $\dfrac{|\mathcal{T}_n'|}{|\mathcal{T}_n|} \to 0$.

Hence, in conclusion, almost all trees do not belong to $\mathcal{T}_n'$, and the result follows. □
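The final comparison of growth rates can be checked numerically. The sketch below assumes that the upper bound on $|\mathcal{T}_n'|$ grows like $2^{n/3} \cdot x_0^{-\frac{25}{48}n} \cdot \binom{\frac{25}{48}n}{\frac{1}{24}n}$ up to polynomial factors; these exponents are a reading of the counting argument above and should be treated as an assumption. It uses the exponential rate of a binomial coefficient given by Stirling's formula and compares the resulting base with $1/x_0$, the growth rate of $|\mathcal{T}_n|$.

```python
import math

X0 = 0.3383219            # radius of convergence x_0 quoted in the text
a, b = 25 / 48, 1 / 24    # assumed sizes: |A| <= a*n and |T''| <= b*n

# Stirling: binom(a*n, b*n)^(1/n) -> exp(a ln a - b ln b - (a - b) ln(a - b)).
binomial_rate = math.exp(
    a * math.log(a) - b * math.log(b) - (a - b) * math.log(a - b)
)

# Exponential growth base of the assumed bound on |T_n'|.
beta = 2 ** (1 / 3) * X0 ** (-a) * binomial_rate

print(beta, 1 / X0, beta < 1 / X0)
```

If the assumed form of the bound is right, the ratio $|\mathcal{T}_n'|/|\mathcal{T}_n|$ decays geometrically, and the polynomial factors $n/24$ and $n^{5/2}$ do not affect the conclusion.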



Next, we proceed to estimate the number of rooted trees for a given tree from Theorem 3 and Lemma 4. The following theorem is established.

Theorem 5. For almost all trees in $\mathcal{T}_n$, the number of corresponding rooted trees is $(\mu_r + o(1))n$.

Proof. By Lemma 4 we know that almost every tree has at least $\frac{1}{24}n$ fixed vertices, and we denote the set of these trees by $\mathcal{T}_n^*$. Clearly, $\mathcal{T}_n^* \subseteq \mathcal{T}_n$ and $|\mathcal{T}_n^*|/|\mathcal{T}_n| \to 1$. Let $T$ be a tree in $\mathcal{T}_n^*$. If we pick one of the fixed vertices to be the root, we get a rooted tree having the same number of vertex classes. There are at least $\frac{1}{24}n$ rooted trees whose roots correspond to the fixed vertices of $T$, and their number of vertex classes equals that of $T$. Hence, there are at least $|\mathcal{T}_n^*| \cdot \frac{1}{24}n$ rooted trees in $\mathcal{R}_n$ such that the roots are fixed vertices in the associated tree. These rooted trees form a set $\mathcal{R}_n^*$. Notice that $|\mathcal{R}_n| \sim D x_0^{-n}/n^{3/2}$ and $|\mathcal{T}_n^*| \sim C x_0^{-n}/n^{5/2}$. We get that $|\mathcal{R}_n^*|/|\mathcal{R}_n|$ is bounded away from 0. In conjunction with Theorem 3, we have that the number of vertex classes is $(\mu_r + o(1))n$ for almost all rooted trees in $\mathcal{R}_n^*$.

According to whether the number of vertex classes is $(\mu_r + o(1))n$ or not, we divide $\mathcal{R}_n^*$ into two parts $\mathcal{R}_{n,1}^*$ and $\mathcal{R}_{n,2}^*$. There are at most $\dfrac{|\mathcal{R}_{n,2}^*|}{n/24}$ trees in $\mathcal{T}_n^*$ corresponding to $\mathcal{R}_{n,2}^*$. Since $|\mathcal{R}_{n,2}^*| = o(|\mathcal{R}_n^*|) = o(|\mathcal{R}_n|)$, we have $\dfrac{|\mathcal{R}_{n,2}^*|}{n/24} = o(|\mathcal{T}_n|) = o(|\mathcal{T}_n^*|)$.

Therefore, almost all trees in $\mathcal{T}_n^*$ correspond to the rooted trees in $\mathcal{R}_{n,1}^*$. Recall that the root of a tree in $\mathcal{R}_n^*$ is a fixed vertex. That is, almost all trees in $\mathcal{T}_n^*$ also have $(\mu_r + o(1))n$ vertex classes. Consequently, almost every tree in $\mathcal{T}_n$ has $(\mu_r + o(1))n$ vertex classes. The proof is complete. □

From Theorem 5 we get an intuitive picture: the rooted tree space is just the tree space scaled by $(\mu_r + o(1))n$. Speaking informally, any special structure in a tree appears, with high probability, $(\mu_r + o(1))n$ times among the rooted trees, and the probabilities of its appearance in the tree space and in the rooted tree space seem to be the same. Moreover, by the asymptotic values of $|\mathcal{R}_n|$ and $|\mathcal{T}_n|$, we get that $\mu_r \approx 0.8210$.

4 The distribution for any pattern in Tn 

In this section, we focus on the distribution of the number of occurrences $X_{n,\mathcal{M}}$ of a pattern $\mathcal{M}$ on the tree space $\mathcal{T}_n$. It is known that the distribution of the occurrences of a pattern in $\mathcal{R}_n$ is asymptotically normal. We refer the readers to [2]. We show that the corresponding distribution in $\mathcal{T}_n$ is also asymptotically normal. It has been shown that $X_{n,\mathcal{M}}(T)$ has mean $(\mu_{\mathcal{M}} + o(1))n$ and variance $(\sigma_{\mathcal{M}} + o(1))n$, and that for almost every tree the number of corresponding rooted trees is $(\mu_r + o(1))n$. The constants $\mu_{\mathcal{M}}$ and $\sigma_{\mathcal{M}}$ are the same as those for the case of rooted trees, namely, $E(X_{n,\mathcal{M}}(T)) \sim \mu_{\mathcal{M}} n \sim E(X_{n,\mathcal{M}}(R))$ and $\mathrm{Var}(X_{n,\mathcal{M}}(T)) \sim \sigma_{\mathcal{M}} n \sim \mathrm{Var}(X_{n,\mathcal{M}}(R))$. Based on these two results, we proceed to get our final result.

Theorem 6. For any given pattern, the number of occurrences of the pattern in trees 
is asymptotically normally distributed. 

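Proof. Recall that for each given pattern $\mathcal{M}$, $E(X_{n,\mathcal{M}}(T)) \sim \mu_{\mathcal{M}} n$ and $\mathrm{Var}(X_{n,\mathcal{M}}(T)) \sim \sigma_{\mathcal{M}} n$, where $\mu_{\mathcal{M}}$ and $\sigma_{\mathcal{M}}$ are some constants. Let $\mathcal{T}_n^{t}$ be the subset of $\mathcal{T}_n$ such that the number of occurrences $X_{n,\mathcal{M}}(T)$ satisfies $\dfrac{X_{n,\mathcal{M}}(T) - \mu_{\mathcal{M}} n}{\sqrt{\sigma_{\mathcal{M}} n}} \le t$, where $t$ is some real number. Then, the probability

$$P\left(\frac{X_{n,\mathcal{M}}(T) - \mu_{\mathcal{M}} n}{\sqrt{\sigma_{\mathcal{M}} n}} \le t\right) = \frac{|\mathcal{T}_n^{t}|}{|\mathcal{T}_n|},$$

where $\mathcal{T}_n^{t}$ is the subset of $\mathcal{T}_n$ defined above. For $\mathcal{R}_n$, we shall show that

$$\lim_{n \to \infty} P\left(\frac{X_{n,\mathcal{M}}(R) - \mu_{\mathcal{M}} n}{\sqrt{\sigma_{\mathcal{M}} n}} \le t\right) = \lim_{n \to \infty} P\left(\frac{X_{n,\mathcal{M}}(T) - \mu_{\mathcal{M}} n}{\sqrt{\sigma_{\mathcal{M}} n}} \le t\right).$$

We know that

$$\lim_{n \to \infty} P\left(\frac{X_{n,\mathcal{M}}(R) - \mu_{\mathcal{M}} n}{\sqrt{\sigma_{\mathcal{M}} n}} \le t\right) = N(0,1,t),$$

where $N(0,1,t)$ denotes the value of the standard normal distribution function at $t$. Denote by $\mathcal{R}_n^{t}$ the set of rooted trees satisfying $\dfrac{X_{n,\mathcal{M}}(R) - \mu_{\mathcal{M}} n}{\sqrt{\sigma_{\mathcal{M}} n}} \le t$. The last equation holds because the number of occurrences of any pattern in $\mathcal{R}_n$ is asymptotically normally distributed.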

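If $R$ is a rooted tree in $\mathcal{R}_n$ corresponding to $T \in \mathcal{T}_n$, then $X_{n,\mathcal{M}}(R) = X_{n,\mathcal{M}}(T)$. So, a tree $T$ is in $\mathcal{T}_n^{t}$ if and only if all the associated rooted trees are in $\mathcal{R}_n^{t}$. We divide $\mathcal{T}_n^{t}$ into two subsets, $\mathcal{T}_n'^{\,t}$ and $\mathcal{T}_n''^{\,t}$: the first is the collection of trees corresponding to $(\mu_r + o(1))n$ rooted trees, and the second consists of the remaining trees. By Theorem 5, the number of rooted trees corresponding to $\mathcal{T}_n'^{\,t}$ is $|\mathcal{T}_n'^{\,t}| \cdot (\mu_r + o(1))n$, and $|\mathcal{T}_n''^{\,t}| = o(|\mathcal{T}_n^{t}|)$, i.e., the number of rooted trees associated with $\mathcal{T}_n''^{\,t}$ is at most $o(|\mathcal{T}_n^{t}|) \cdot n$. Then, it follows that

$$|\mathcal{T}_n'^{\,t}| \cdot (\mu_r + o(1))n \le |\mathcal{R}_n^{t}| \le |\mathcal{T}_n'^{\,t}| \cdot (\mu_r + o(1))n + o(|\mathcal{T}_n^{t}|) \cdot n.$$

Since $o(|\mathcal{T}_n^{t}|) \cdot n = o(|\mathcal{R}_n^{t}|)$ and $|\mathcal{T}_n'^{\,t}| / |\mathcal{T}_n^{t}| \sim 1$, we have $|\mathcal{R}_n^{t}| = (\mu_r + o(1))n \cdot |\mathcal{T}_n^{t}|$.

Therefore, since $|\mathcal{R}_n| = (\mu_r + o(1))n \cdot |\mathcal{T}_n|$ as well, we get that

$$P\left(\frac{X_{n,\mathcal{M}}(R) - \mu_{\mathcal{M}} n}{\sqrt{\sigma_{\mathcal{M}} n}} \le t\right) = \frac{|\mathcal{R}_n^{t}|}{|\mathcal{R}_n|} = (1 + o(1))\,\frac{|\mathcal{T}_n^{t}|}{|\mathcal{T}_n|} = (1 + o(1))\, P\left(\frac{X_{n,\mathcal{M}}(T) - \mu_{\mathcal{M}} n}{\sqrt{\sigma_{\mathcal{M}} n}} \le t\right).$$

Consequently,

$$\lim_{n \to \infty} P\left(\frac{X_{n,\mathcal{M}}(T) - \mu_{\mathcal{M}} n}{\sqrt{\sigma_{\mathcal{M}} n}} \le t\right) = \lim_{n \to \infty} P\left(\frac{X_{n,\mathcal{M}}(R) - \mu_{\mathcal{M}} n}{\sqrt{\sigma_{\mathcal{M}} n}} \le t\right) = N(0,1,t).$$

Then the variable $X_{n,\mathcal{M}}(T)$ is also asymptotically normal with mean $\sim \mu_{\mathcal{M}} n$ and variance $\sim \sigma_{\mathcal{M}} n$. □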



Now, we have established that for any pattern, the limiting distribution of the number of occurrences in $\mathcal{T}_n$ is also normal, which solves an open question raised in [7].